AITopics | exponential function

Collaborating Authors

exponential function

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

PSL: Rethinking and Improving Softmax Loss from Pairwise Perspective for Recommendation

Neural Information Processing SystemsMar-22-2026, 16:37:50 GMT

Softmax Loss (SL) is widely applied in recommender systems (RS) and has demonstrated effectiveness. This work analyzes SL from a pairwise perspective, revealing two significant limitations: 1) the relationship between SL and conventional ranking metrics like DCG is not sufficiently tight; 2) SL is highly sensitive to false negative instances. Our analysis indicates that these limitations are primarily due to the use of the exponential function. To address these issues, this work extends SL to a new family of loss functions, termed Pairwise Softmax Loss (PSL), which replaces the exponential function in SL with other appropriate activation functions. While the revision is minimal, we highlight three merits of PSL: 1) it serves as a tighter surrogate for DCG with suitable activation functions; 2) it better balances data contributions; and 3) it acts as a specific BPR loss enhanced by Distributionally Robust Optimization (DRO).

artificial intelligence, machine learning, proceedings, (8 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.60)

Add feedback

Robust Bi-Tempered Logistic Loss Based on Bregman Divergences

Ehsan Amid, Manfred K. K. Warmuth, Rohan Anil, Tomer Koren

Neural Information Processing SystemsFeb-12-2026, 21:36:45 GMT

Neural Information Processing Systems http://nips.cc/

divergence, logistic loss, loss function, (14 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)
North America > United States > Pennsylvania (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(4 more...)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.95)

Add feedback

OptimalApproximation-SmoothnessTradeoffsfor Soft-MaxFunctions

Neural Information Processing SystemsFeb-7-2026, 16:36:11 GMT

A soft-max function is a mechanism for choosing one out of a number of options, given the value of each option.

artificial intelligence, machine learning, soft-max function, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.94)

Add feedback

Expectation Propagation for t-Exponential Family Using q-Algebra

Futoshi Futami, Issei Sato, Masashi Sugiyama

Neural Information Processing SystemsNov-21-2025, 05:08:33 GMT

Exponential family distributions are highly useful in machine learning since their calculation can be performed efficiently through natural parameters. The exponential family has recently been extended to the t-exponential family, which contains Student-t distributions as family members and thus allows us to handle noisy data well. However, since the t-exponential family is defined by the deformed exponential, an efficient learning algorithm for the t-exponential family such as expectation propagation (EP) cannot be derived in the same way as the ordinary exponential family. In this paper, we borrow the mathematical tools of q-algebra from statistical physics and show that the pseudo additivity of distributions allows us to perform calculation of t-exponential family distributions through natural parameters. We then develop an expectation propagation (EP) algorithm for the t-exponential family, which provides a deterministic approximation to the posterior or predictive distribution with simple moment matching. We finally apply the proposed EP algorithm to the Bayes point machine and Student-t process classification, and demonstrate their performance numerically.

artificial intelligence, machine learning, t-exponential family, (18 more...)

Neural Information Processing Systems

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.05)
North America > United States > Massachusetts (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)

Add feedback

A General Constructive Upper Bound on Shallow Neural Nets Complexity

Hakl, Frantisek, Fojtik, Vit

arXiv.org Machine LearningOct-9-2025

We provide an upper bound on the number of neurons required in a shallow neural network to approximate a continuous function on a compact set with a given accuracy. This method, inspired by a specific proof of the Stone-Weierstrass theorem, is constructive and more general than previous bounds of this character, as it applies to any continuous function on any compact set.

2025-draft version, constructive upper bound, prague, (13 more...)

arXiv.org Machine Learning

2510.06372

Country:

Europe > Czechia > Prague (0.08)
Europe > Germany > North Rhine-Westphalia > Upper Bavaria > Munich (0.04)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Robust Bi-Tempered Logistic Loss Based on Bregman Divergences

Ehsan Amid, Manfred K. K. Warmuth, Rohan Anil, Tomer Koren

Neural Information Processing SystemsOct-3-2025, 04:47:45 GMT

By tuning the two temperatures, we create loss functions that are non-convex already in the single layer case.

artificial intelligence, divergence, machine learning, (16 more...)

Neural Information Processing Systems

Country: North America > United States (0.68)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.95)

Add feedback

the following results and discussions in the final version of the manuscript

Neural Information Processing SystemsOct-2-2025, 13:32:15 GMT

We greatly appreciate the three reviewers for their valuable comments. The following are our responses. Therefore, the cumulative hazard function also plays a crucial role in generating a median predictor. The performance is evaluated by the mean absolute error, summarized below. These results demonstrate the effectiveness of our model in the prediction task.

artificial intelligence, ct -lstm model, machine learning, (16 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.37)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.33)

Add feedback

Mirror Descent Using the Tempesta Generalized Multi-parametric Logarithms

Cichocki, Andrzej

arXiv.org Machine LearningJun-18-2025

In this paper, we develop a wide class Mirror Descent (MD) algorithms, which play a key role in machine learning. For this purpose we formulated the constrained optimization problem, in which we exploits the Bregman divergence with the Tempesta multi-parametric deformation logarithm as a link function. This link function called also mirror function defines the mapping between the primal and dual spaces and is associated with a very-wide (in fact, theoretically infinite) class of generalized trace-form entropies. In order to derive novel MD updates, we estimate generalized exponential function, which closely approximates the inverse of the multi-parametric Tempesta generalized logarithm. The shape and properties of the Tempesta logarithm and its inverse-deformed exponential functions can be tuned by several hyperparameters. By learning these hyperparameters, we can adapt to distribution or geometry of training data, and we can adjust them to achieve desired properties of MD algorithms. The concept of applying multi-parametric logarithms allow us to generate a new wide and flexible family of MD and mirror-less MD updates.

artificial intelligence, machine learning, optimization problem, (16 more...)

arXiv.org Machine Learning

2506.13984

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
North America > United States (0.14)
Europe > Russia (0.04)
(2 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.31)

Add feedback

PSL: Rethinking and Improving Softmax Loss from Pairwise Perspective for Recommendation

Neural Information Processing SystemsMay-27-2025, 19:03:35 GMT

pairwise perspective, recommendation, softmax loss, (5 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.64)

Add feedback

Two-parameter superposable S-curves

S, Vijay Prakash

arXiv.org Artificial IntelligenceMay-7-2025

Straight line equation $y=mx$ with slope $m$, when singularly perturbed as $ay^3+y=mx$ with a positive parameter $a$, results in S-shaped curves or S-curves on a real plane. As $a\rightarrow 0$, we get back $y=mx$ which is a cumulative distribution function of a continuous uniform distribution that describes the occurrence of every event in an interval to be equally probable. As $a\rightarrow\infty$, the derivative of $y$ has finite support only at $y=0$ resembling a degenerate distribution. Based on these arguments, in this work, we propose that these S-curves can represent maximum entropy uniform distribution to a zero entropy single value. We also argue that these S-curves are superposable as they are only parametrically nonlinear but fundamentally linear. So far, the superposed forms have been used to capture the patterns of natural systems such as nonlinear dynamics of biological growth and kinetics of enzyme reactions. Here, we attempt to use the S-curve and its superposed form as statistical models. We fit the models on a classical dataset containing flower measurements of iris plants and analyze their usefulness in pattern recognition. Based on these models, we claim that any non-uniform pattern can be represented as a singular perturbation to uniform distribution. However, our parametric estimation procedure have some limitations such as sensitivity to initial conditions depending on the data at hand.

artificial intelligence, initial condition, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2504.19488

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.89)

Add feedback